Abstract

Political science datasets contain information of interest to planners
seeking to predict international relations. The goal of this project
was to use modern data mining techniques to determine whether such
information exists and, if so, to characterize it. We developed a new
approach for such analysis based on geometric harmonics. At the heart of
our approach is the observation that such relationships are inherently
non-linear and that the data are noisy and incomplete. To demonstrate
the power and usefulness of our techniques, we focused on United
Nations voting data. We showed that major historical events could
be inferred from these data; that other (linear) techniques did not
suffice; and that our techniques could be extended to understanding
certain aspects of international relations. We conclude that the project
was successful in opening up the field of “computational international
relations.”


1  Introduction 

How do the religious preferences, gender, age, income and place of birth
influence whether an individual will likely engage in terrorist activities?
How do social and familial context affect this estimate? How does the list of
countries belonging to a particular intergovernmental organization define the
organization? Conversely, what does such membership imply about the country?
More particularly, how might such contextual data be codified and fused into
a coherent global estimate? Although the above examples are stated at
different scales, they illustrate the questions facing data analysis in the
social sciences. Progress made in this project toward answering such
questions is reviewed below.

At a technical level, existing analysis methods, or forms of data
imputation, are mainly either linear or dependent on underlying (but unknown
and perhaps unknowable) probability distributions and parameterizations. But
in social situations data are rarely linearly distributed and sampling
questions remain confounded. Moreover, the data may be incomplete; they may
be non-veridical; and they may be distorted because the subject is
non-cooperative.

The approach taken in this project does not suffer from these
shortcomings. It is non-linear and does not presuppose parametric forms.
Instead it is based on the observation that the conceptual structure in data
can often be abstracted mathematically as a low-dimensional manifold embedded
in a high-dimensional space. It builds on the analysis of questionnaire data.
To illustrate: each question is in effect a separate measurement and can be
considered as a separate dimension. While there may be many questions (e.g.,
500-1000 or more), they are rarely completely independent of one another.
Thus information “implicit” within the questions exists even though the
subject may believe it is hidden. It is this implicit information that the
subjective analyst seeks to intuit; and it is these intuitions that have
driven the existing data compilations.

This approach has been applied to trade, IGO membership, conflict and
voting databases. Working in collaboration with political scientists, we
have refined the techniques so that it is now possible to derive embeddings
representative of a number of political developments. Our experience
indicates that UN voting patterns are more predictive than, e.g., IGO
memberships, and that many key events, such as the development and subsequent
break-up of the Soviet Union, can be readily seen. It follows, then, that
there remains significant additional structure to be inferred from these
databases and their integration.

2  Background  and  Preview 

Understanding the role, power, and message of InterGovernmental
Organizations (IGOs) remains a key mission for political and social
scientists. IGOs provide a channel for information flow, and a source of data
from which policy for governing international relations (IR) could be
determined. At the request of former Program Manager Dr. Lyons, this project
developed specific data mining techniques to reveal the information implicit
within United Nations General Assembly voting records. It was his view that
understanding the structures of these networks could shed light on important
hypotheses and theoretical questions in IR such as: (i) Do IGOs have any
influence on armed conflicts? (ii) What do votes in the UN General Assembly
reveal? (iii) Does trade between (democratic) countries help to reduce
conflicts between them? His intuition, it turns out, was largely correct.

We preview our results with Fig. 1. Although these images are small, the
.pdf files can be enlarged on your screen. More interestingly, it is helpful
to view the results rotating in 3-D; we have provided a web page for these to
be viewed at

http://www.cs.yale.edu/homes/vision/zucker/embeddings.html

The idea behind our approach, in short, is to view countries as points in
a kind of galaxy of other countries. The galaxy is arranged by a similarity
measure, in this case based on UN voting patterns. Intuitively, each country
is modeled by the vector of its votes on every issue. Each vote can be
thought of as a kind of coordinate, with the possibilities for vote 1 being
YES, NO, or Abstain. Vote 2 is another coordinate, perpendicular to the
first; vote 3 is another again, perpendicular to both; and so on through the
nearly 1,000 votes taken.
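
As a rough sketch of this representation, each country can be stored as a
vector with one entry per resolution. The numeric coding below (YES = 1,
Abstain = 0, NO = -1), the toy vote records, and the helper names
`vote_vector` and `squared_distance` are illustrative assumptions, not the
coding used in the project.

```python
# Hypothetical numeric coding of a vote; the report does not specify one.
CODING = {"YES": 1.0, "ABSTAIN": 0.0, "NO": -1.0}

def vote_vector(votes):
    """Map a list of vote labels to a numeric vector (one coordinate per vote)."""
    return [CODING[v] for v in votes]

def squared_distance(u, v):
    """Squared Euclidean distance between two vote vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

# Toy records for three made-up countries over four resolutions.
records = {
    "A": ["YES", "YES", "NO", "ABSTAIN"],
    "B": ["YES", "YES", "NO", "YES"],
    "C": ["NO", "NO", "YES", "ABSTAIN"],
}
vectors = {name: vote_vector(v) for name, v in records.items()}

d_ab = squared_distance(vectors["A"], vectors["B"])  # countries that mostly agree
d_ac = squared_distance(vectors["A"], vectors["C"])  # countries that mostly disagree
```

Under this coding, countries that vote alike land near each other in the
high-dimensional vote space, which is the starting point for the similarity
measure.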

Of course, viewing points in 1000-dimensional spaces is impossible, nor is
there that much information available. So the goal is to reduce the dimension
of the space, while leaving the essential arrangement of the countries (the
galaxy) effectively intact. This is done with a technique called diffusion
geometry, and it is based on the idea that countries are close (in the
galaxy) when they share lots of political, social, trade and economic
capital. (All of these concepts are developed more fully in the body of this
report.)

Fig. 1 shows this dimension-reduced galaxy. This example is chosen to
illustrate how our “history independent” techniques can infer major
historical events just from the UN voting data, in this case France’s
self-isolation under de Gaulle’s presidency. In 1957 France (cyan star, upper
left corner) was close to the USA, UK, Belgium, and Luxembourg (blue markers)
in the galaxy of countries. By 1959, France, under the influence of Charles
de Gaulle and his policies, began to withdraw from NATO military commands.
The process was completed in 1966. Thus, when we look at the maps as time
proceeds, we see France slowly move to the edge of the (blue) Western group
in 1960, gradually edge further away by 1963, and plant itself in a position
distant from that of the West in 1967. French foreign policy returned to the
West after de Gaulle left office in 1969: notice France (cyan star, bottom
left) moving back toward integration in NATO. Its position in 1972-1973 drew
closer and closer to that of UKG (blue triangle, top left), as FRN opened up
from its self-isolation and allowed UKG to join the EC in 1973. Many more
examples and discussion of the distance measure are contained in Sec. 6.

2.1  Overview  of  Final  Report 

An overview of the report is as follows. To start, we review diffusion
geometry, a non-linear dimensionality reduction technique based on the
concept of diffusion distance, which considers not only direct dyadic
connections between social actors, but also all indirect paths of diffusion
through intermediate neighbors. This is important in political science
because influence accumulates in a manner that is not revealed by linear
techniques.

This technique has been applied to socio-political databases, such as IGO
membership [13] and UN voting [16]. These are described next. While these
databases have received significant attention from scholars of international
relations [6,7,10,14], we do not believe that they have previously been
analyzed by techniques such as ours. Several papers do, in effect, support
our approach [2,8,11,17].

Following this, we review a hierarchical clustering algorithm that was
developed to identify themes running through the voting patterns. This is, in
effect, the complement to the above, because it reveals structure among
resolutions rather than countries. Taken together, both techniques reveal how
much structure is implicit within UN General Assembly voting patterns.


3  Diffusion  distance 

We approach the dimensionality reduction problem by means of a social
network model: consider $G(V, W)$ as a network whose vertices $i \in V$ are
the countries and whose kernel function $W_{ij}$, derived from the data $X$,
measures the similarity between countries $i$ and $j$.

Social phenomena and trade, unlike geography, follow a different distance
measure. Goods and social capital diffuse from one place to another, perhaps
through an intermediate country. Thus nearby countries matter more than
distant ones. Classical techniques preserve all pairwise Euclidean distances
between the data points; we argue instead that not all distances should be
preserved uniformly. Only short distances should be maintained, and even
attenuated in order to preserve the local structure, while long distances
need not be preserved. The argument is illustrated in Fig. 2. In political
terms, we see a polarization into two camps: members of the same camp (B, C)
communicate closely, but members of different camps (A, B) barely interact
with each other except through intermediary contacts located in the middle
tunnel. An embedding which highlights this polarization should tighten the
clusters’ girth (thus attenuating short distances) and stretch the tunnel’s
length (thus loosening long distances and separating the two clusters from
each other). Those are the characteristics of diffusion. While a distance
could have been derived from gravitational potentials, as in [15], to our
knowledge this is the first application of diffusion distance to
sociopolitical questions.

Figure 1: De Gaulle’s France: Diffusion maps of UN voting pattern 1957-1975.
Panels: (a) 1957, (b) 1960, (c) 1963, (d) 1965, (e) 1967, (f) 1970, (g) 1972,
(h) 1973, (i) 1975. Several countries are marked for case study
identification: ★ (USA), ▲ (UKG), ★ (FRN), ■ (BEL, LUX, GFR), ★ (RUS). These
maps show that France started out close to the Allies in 1957. Then in 1960,
France, under de Gaulle’s presidency, distanced itself from the West. The 70s
saw France coming back toward the Western fold, once de Gaulle had left.

Think of a substance (e.g. money, population, or political influence)
diffusing from a source point out to its neighboring points in amounts
proportional to the neighbors’ similarity to the source. The substance
continues to diffuse to the neighbors of those neighbors, etc. Assuming a
fixed amount of substance in the network, we can define $p_t(k|i)$ as the
density of substance, originating from source point $i$, at point $k$ at time
$t$. Thus $p_t(k|i)$ would be high if there are many paths of length $\le t$
connecting $i$ to $k$, and low otherwise. If we take point $i = B$ on the
right of Fig. 2 as the source, after $t$ time steps, most of the substance
originating from $B$ should end up at points like $k = C$ in the right
cluster, and only a small fraction ends up at points like $k = A$ on the
left, because there are significantly more paths from $B$ to $C$ than to $A$.
The intuitive diffusion distance [3] between any two points $i$ and $j$ is a
weighted difference between the two probability density functions:

$$D_t^2(i,j) = \| p_t(\cdot|i) - p_t(\cdot|j) \|_\omega^2 = \sum_k \left( p_t(k|i) - p_t(k|j) \right)^2 \omega(k) \qquad (1)$$

where $\omega(\cdot)$ is the weight function that normalizes the distance
according to the density estimate of each vertex.
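
A small numerical sketch of Eq. 1 may help fix ideas. It assumes a toy graph
shaped like Fig. 2 (two 4-node cliques joined through a single intermediary),
a row-normalized random walk as the diffusion step (the transition matrix of
Sec. 4), and the weight $\omega(k) = 1/d_k$, with $d_k$ the degree of node
$k$. The graph, the function names, and the choice $t = 3$ are all
illustrative assumptions.

```python
import numpy as np

def two_cluster_graph():
    """Toy similarity matrix: two 4-node cliques linked through node 4."""
    n = 9
    W = np.zeros((n, n))
    left, right = [0, 1, 2, 3], [5, 6, 7, 8]
    for group in (left, right):
        for a in group:
            for b in group:
                if a != b:
                    W[a, b] = 1.0
    # Node 4 is the narrow "tunnel" between the two clusters.
    for a, b in [(3, 4), (4, 5)]:
        W[a, b] = W[b, a] = 1.0
    return W

def diffusion_distance_sq(W, i, j, t):
    """Eq. 1 with omega(k) = 1/d_k, computed by brute-force matrix powers."""
    d = W.sum(axis=1)                      # degree of each node
    M = W / d[:, None]                     # row-stochastic transition matrix
    Pt = np.linalg.matrix_power(M, t)      # Pt[i, k] = p_t(k | i)
    diff = Pt[i] - Pt[j]
    return float(np.sum(diff ** 2 / d))

W = two_cluster_graph()
A, B, C = 0, 8, 7                          # A on the left; B and C on the right
same_cluster = diffusion_distance_sq(W, B, C, t=3)
across = diffusion_distance_sq(W, A, B, t=3)
```

In this toy setting `same_cluster` comes out much smaller than `across`,
matching the intuition that many short paths link B and C while only the
tunnel links A and B.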

Figure  2:  Two  tight  clusters  separated  by  a  narrow  path.  It  is  obvious  that 
there  are  many  paths  between  any  pair  of  nodes  from  the  same  cluster  (B  and 
C),  while  there  are  significantly  fewer  paths  between  any  pair  of  nodes  from 
different  clusters  (A  and  B). 

International trade can also be viewed as a diffusion process in which
money diffuses from country to country. The polarization in Fig. 2 can be
described in terms of trade during the Cold War. Take the USA and the USSR as
the two sources. Assuming the trade pattern stays constant, the money will
diffuse out to the two sources’ trading partners, like “bumps” of heat
diffusing through a graph. Thus $p_t(\cdot|\mathrm{USA})$ will be high in the
West and low in the East, while $p_t(\cdot|\mathrm{USSR})$ behaves in the
opposite way. The function $p_t(\cdot|\mathrm{USA})$ provides a notion of the
“trading sphere” of the USA. Therefore, the diffusion distance between the
USA and the USSR can be defined as the difference between their corresponding
spheres $p_t(\cdot|\mathrm{USA})$ and $p_t(\cdot|\mathrm{USSR})$, as
described by Eq. 1.

4  Random  walk 

In order to compute the diffusion distance $D_t(i,j)$, which takes into
account all paths (of length $t$) between $i$ and $j$, we begin by
considering a random walk of a traveler in a network of countries $G(V, W)$.
The transition probability is given by

$$M = D^{-1} W \qquad (2)$$

where $D$ is the diagonal matrix with entries $D_{ii} = d_i = \sum_j W_{ij}$,
called the degree matrix. The matrix
$M_s = D^{1/2} M D^{-1/2} = D^{-1/2} W D^{-1/2}$ is thus symmetric and has
the same spectrum as $M$. If $p_t(i)$ denotes the probability that the
traveler appears in country $i$ at time $t$, then

$$p_{t+1}^T = p_t^T M = p_t^T D^{-1} W \qquad (3)$$
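Let $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_n$ be the eigenvalues of
the symmetric matrix $M_s = D^{-1/2} W D^{-1/2}$ and $\{v_k\}$ their
corresponding orthonormal eigenvectors:

$$M_s = V \Lambda V^T \qquad (4)$$

where $\Lambda$ is the diagonal matrix with $\{\lambda_k\}$ on its diagonal,
and $V$ is a matrix whose columns are the corresponding eigenvectors
$\{v_k\}$.

Therefore

$$M = D^{-1/2} M_s D^{1/2} = D^{-1/2} V \Lambda V^T D^{1/2} = \Psi \Lambda \Phi^T \qquad (5)$$

where $\Phi$ and $\Psi$ are the matrices whose columns are

$$\phi_k = D^{1/2} v_k, \qquad \psi_k = D^{-1/2} v_k \qquad (6)$$

which implies that $\{\phi_k\}$ and $\{\psi_k\}$ defined in Eq. 6 are the
left and right eigenvectors of $M$ corresponding to eigenvalues
$\{\lambda_k\}$. Since $\{v_k\}$ are orthonormal vectors, $\phi_i$ and
$\psi_j$ are bi-orthonormal:

$$\phi_i^T \psi_j = \delta_{ij} \qquad (7)$$

We can also verify that

$$M_s d^{1/2} = D^{-1/2} W D^{-1/2} d^{1/2} = D^{-1/2} W \mathbf{1} = D^{-1/2} d = d^{1/2} \qquad (8)$$

where $d$ is the vector of degrees $d_i$ and $d^{1/2}$ the vector of their
square roots. Therefore $d^{1/2}$ is an eigenvector of $M_s$ with eigenvalue
1, and hence $\forall k\; \lambda_k \le 1$ [12]. Thus $\lambda_1 = 1$. In
fact, if $G$ is connected (so that $M$ represents an irreducible and
aperiodic Markov chain) then $\forall k > 1\; |\lambda_k| < 1 = \lambda_1$.
We also have $v_1 = d^{1/2} / \|d^{1/2}\|$, which leads to
$\phi_1 = d / \sqrt{\sum_i d_i}$ and $\psi_1 = \mathbf{1} / \sqrt{\sum_i d_i}$.
That means $\psi_1$ is a constant vector, while
$\phi_1(i) = d_i / \sqrt{\sum_j d_j}$.

Let $p_t(j|i)$ be the probability that the traveler starts walking from
country $i$ and appears in country $j$ at time $t$; then it follows from
Eq. 3 that

$$p_t(j|i) = e_i^T M^t e_j = e_i^T \Psi \Lambda^t \Phi^T e_j = \sum_k \lambda_k^t \psi_k(i) \phi_k(j) \qquad (9)$$

where $e_i$ is a vector whose entries are $e_i(k) = \delta_{ik}$. Therefore,
if $G$ is connected, the following limit holds, regardless of the initial
starting point:

$$\lim_{t \to +\infty} p_t(j|i) = \phi_1(j) \psi_1(i) = \frac{d_j}{\sum_k d_k} \qquad (10)$$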


The first eigenvector $\phi_1$ serves (up to normalization) as the
stationary distribution of the random walk $M$. It can also be considered a
density estimate, which tells us how frequently our walker passes by a
particular country. In social network terminology, it is the centrality
vector.
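
The limit in Eq. 10 can be checked numerically. The sketch below assumes a
small made-up similarity matrix with a positive diagonal (so the walk is
aperiodic) and simply raises the transition matrix to a high power; the
matrix values and variable names are illustrative assumptions.

```python
import numpy as np

# Toy symmetric similarity matrix (an assumption for illustration only).
# The positive diagonal keeps the walk aperiodic.
W = np.array([
    [1.0, 0.8, 0.1, 0.0],
    [0.8, 1.0, 0.3, 0.1],
    [0.1, 0.3, 1.0, 0.9],
    [0.0, 0.1, 0.9, 1.0],
])

d = W.sum(axis=1)                  # degrees d_i = sum_j W_ij
M = W / d[:, None]                 # transition matrix M = D^{-1} W (Eq. 2)

stationary = d / d.sum()           # predicted limit d_j / sum_k d_k (Eq. 10)
Pt = np.linalg.matrix_power(M, 200)

# Every row of M^t approaches the same stationary distribution,
# whatever the starting country i.
max_gap = float(np.abs(Pt - stationary).max())
```

Each row of `Pt` corresponds to a different starting country, so a small
`max_gap` illustrates that the limit does not depend on where the walker
starts.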

5  Diffusion  Maps 

For each country $i$, we can imagine the diffusion process starts with an
initial distribution $p_0(j|i) = \delta_{ij}$. After $t$ steps, this
distribution diffuses out to the neighborhood of $i$, with the landscape
described by $p_t(j|i)$. The walker is more likely to end up in states close
to $i$ than those far away. The diffusion distance $D_t^2(i,j)$ can be
measured by Eq. 1, with the weight function $\omega(k) = 1/d_k$ (proportional
to $1/\phi_1(k)$), which normalizes the distance by the centrality measure of
each node. $D_t^2(i,j)$ can be seen as the weighted difference between the
two distributions of concentrations after $t$ steps of two random walks
starting from nodes $i$ and $j$.

We also define the diffusion map $\Psi_t$ as the mapping from the original
data space onto the first $k$ right eigenvectors of $M$:

$$\Psi_t(i) = \left( \lambda_1^t \psi_1(i), \lambda_2^t \psi_2(i), \ldots, \lambda_k^t \psi_k(i) \right) \qquad (11)$$

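It is easily verifiable that the diffusion distance in Eq. 1, with
$\omega(l) = 1/d_l$, is equal to the Euclidean distance in the diffusion map
space when all $n$ eigenvectors are kept:

$$
\begin{aligned}
D_t^2(i,j) &= \sum_{l=1}^{n} \Big( \sum_{k} \lambda_k^t \big( \psi_k(i) - \psi_k(j) \big) \phi_k(l) \Big)^2 \frac{1}{d_l} \\
&= \sum_{k_1,k_2} \lambda_{k_1}^t \big( \psi_{k_1}(i) - \psi_{k_1}(j) \big) \, \lambda_{k_2}^t \big( \psi_{k_2}(i) - \psi_{k_2}(j) \big) \sum_{l=1}^{n} \frac{\phi_{k_1}(l) \phi_{k_2}(l)}{d_l} \\
&= \sum_{k_1,k_2} \lambda_{k_1}^t \big( \psi_{k_1}(i) - \psi_{k_1}(j) \big) \, \lambda_{k_2}^t \big( \psi_{k_2}(i) - \psi_{k_2}(j) \big) \sum_{l=1}^{n} v_{k_1}(l) v_{k_2}(l) \\
&= \sum_{k_1,k_2} \lambda_{k_1}^t \big( \psi_{k_1}(i) - \psi_{k_1}(j) \big) \, \lambda_{k_2}^t \big( \psi_{k_2}(i) - \psi_{k_2}(j) \big) \, \delta_{k_1 k_2} \\
&= \sum_{k} \lambda_k^{2t} \big( \psi_k(i) - \psi_k(j) \big)^2 \\
&= \| \Psi_t(i) - \Psi_t(j) \|^2
\end{aligned}
\qquad (12)
$$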

Practically, only the last $(k - 1)$ coordinates are to be considered
because $\psi_1$ is a constant vector. Additionally, since
$\forall k\; |\lambda_k| \le 1$, the components $\lambda_k^t \psi_k(i)$ in
Eq. 11 corresponding to smaller values of $|\lambda_k|$ vanish rapidly as $t$
increases, achieving nonlinear dimensionality reduction.
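
The equality in Eq. 12 can be verified numerically on a small example. The
sketch below assumes a made-up similarity matrix and the weight
$\omega(l) = 1/d_l$; the function name `diffusion_map` and its signature are
hypothetical, and the eigen-decomposition is done on the symmetric matrix
$D^{-1/2} W D^{-1/2}$ with NumPy's `eigh`.

```python
import numpy as np

def diffusion_map(W, t, k=None):
    """Return (eigenvalues, embedding); row i of the embedding is Psi_t(i) (Eq. 11)."""
    d = W.sum(axis=1)
    Ms = W / np.sqrt(np.outer(d, d))       # symmetric D^{-1/2} W D^{-1/2}
    lam, V = np.linalg.eigh(Ms)            # ascending eigenvalues, orthonormal columns
    order = np.argsort(-lam)               # reorder so that lambda_1 = 1 comes first
    lam, V = lam[order], V[:, order]
    psi = V / np.sqrt(d)[:, None]          # right eigenvectors psi_k = D^{-1/2} v_k
    emb = psi * lam ** t                   # column k scaled by lambda_k^t
    if k is not None:
        emb = emb[:, :k]
    return lam, emb

# Toy similarity matrix (illustrative assumption).
W = np.array([
    [1.0, 0.8, 0.1, 0.0],
    [0.8, 1.0, 0.3, 0.1],
    [0.1, 0.3, 1.0, 0.9],
    [0.0, 0.1, 0.9, 1.0],
])
t = 2
lam, emb = diffusion_map(W, t)

# Left-hand side of Eq. 12: Eq. 1 evaluated directly from M^t with omega = 1/d.
d = W.sum(axis=1)
Pt = np.linalg.matrix_power(W / d[:, None], t)
i, j = 0, 3
direct = float(np.sum((Pt[i] - Pt[j]) ** 2 / d))

# Right-hand side of Eq. 12: squared Euclidean distance in the diffusion map.
embedded = float(np.sum((emb[i] - emb[j]) ** 2))
```

Passing a small `k` (for instance `k=3`) keeps only the leading coordinates,
in which case `embedded` becomes an approximation of `direct` rather than an
exact match.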


6  Experimental  Results 

We present several examples of the application of our diffusion maps
algorithm on geopolitical databases. Three-dimensional visualizations of the
results are also made available online at
http://www.cs.yale.edu/homes/vision/zucker/embeddings.ht


6.1  Geographical  map:  A  physical  perspective 

Fig. 3 provides an experiment with geographical embedding of national
capitals [5], with the kernel $W_{ij} = e^{-r_{ij}^2/\sigma}$, where $\sigma$
stands for the kernel scale. The resulting embedding approximates global
positions.

6.2  Intergovernmental organization (IGO) membership pattern

Inter-governmental organizations (IGOs) play a crucial role in
international relations. Fig. 4 reveals how various countries are positioned,
given their IGO memberships [13] in the year 2000. The diffusion maps were
derived using the correlation of joint membership as the kernel function [9].
The maps show that the IGO membership pattern tends to correlate with
regional geographical positions.


6.3  UN  vote  pattern:  de  Gaulle’s  France 

Using the Pearson product-moment correlation kernel [9], we embed the UN
member nations in a three-dimensional space, according to their votes in the
UN General Assembly in various years. Fig. 5 shows the embedding of the
network of UN Assembly members according to their voting patterns at various
times during 1957-1975. These visualizations provide us with a novel
historical perspective.
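
As a hedged sketch of how a correlation kernel of this kind could be built
from a vote matrix, the example below uses NumPy's `corrcoef`. The vote
coding (+1 = YES, 0 = Abstain, -1 = NO), the toy votes, and in particular the
clipping of negative correlations to zero are assumptions made here for
illustration; the report does not state how [9] treats negative values.

```python
import numpy as np

# Rows are countries, columns are resolutions; +1 = YES, 0 = Abstain, -1 = NO.
# Both the coding and the votes are made up for illustration.
votes = np.array([
    [ 1,  1, -1,  0,  1],
    [ 1,  1, -1,  1,  1],
    [-1, -1,  1,  0, -1],
    [-1,  0,  1,  1, -1],
], dtype=float)

# Pearson product-moment correlation between every pair of rows.
corr = np.corrcoef(votes)

# Assumption: negative correlations are clipped to zero so that the kernel
# can be used as non-negative edge weights in the random walk.
W = np.clip(corr, 0.0, None)
```

The resulting `W` is symmetric with ones on the diagonal, so it can be fed
directly into the degree and transition-matrix computations of Sec. 4.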

Additionally, Fig. 6 plots the ratios of embedding distance in the period
1965-2000:

•  dist(FRN,EU*) / diam(EU*) as the blue line

•  dist(UKG,EU*) / diam(EU*) as the red line

•  dist(FRN,UKG) / diam(EU*) as the green line
Figure 3: Geographical embedding of national capitals in 3-dimensional space,
using the 2nd, 3rd, and 4th vectors of the diffusion map. The edge weight
function is defined as $W_{ij} = e^{-r_{ij}^2/\sigma}$, where $r_{ij}$ is the
geographical distance between the capitals of nations $i$ and $j$ and
$\sigma$ stands for the kernel scale. Figure (a) provides a top-down view,
while (b)-(i) show side views of the embedding from different angles, turning
from west to east (counterclockwise). Several countries are marked with
colored squares for easy identification: ■ (USA, UKG, FRN, BEL, ISR),
■ (RUS, CHN, POL, HUN, BLR), ■ (EGY, SYR, LEB, SAU, KUW).
Figure 4: Diffusion map of countries, given their IGO membership in 2000,
using the 2nd, 3rd, and 4th vectors. (a) provides a top-down perspective,
while (b)-(i) show side views from different angles, in counterclockwise
rotation. The countries are manually colored according to their geographical
locations, which shows again that the aligning influence of IGOs is mostly
regional. Legend (with respect to (a)): Caribbean (dark blue, upper left);
Central & South American (medium blue, lower left); Western European (light
blue, upper right); former Soviet states & ISR (yellow, upper right); North
African (light red, middle far right); African (light orange, lower right);
Middle East (dark orange, middle right); USA & CAN (dark red, middle).
Figure 5: De Gaulle’s France: Diffusion maps of UN voting pattern 1957-1975.
Panels: (a) 1957, (b) 1960, (c) 1963, (d) 1965, (e) 1967, (f) 1970, (g) 1972,
(h) 1973, (i) 1975. Several countries are marked for case study
identification: ★ (USA), ▲ (UKG), ★ (FRN), ■ (BEL, LUX, GFR), ★ (RUS). These
maps show that France started out close to the Allies in 1957. Then in 1960,
France, under de Gaulle’s presidency, distanced itself from the West. The 70s
saw France coming back toward the Western fold, once de Gaulle had left.
Figure 6: Ratios of embedding distances between FRN-EU* (blue), UKG-EU*
(red), FRN-UKG (green) in 1965-2000, shown for (a) diffusion distance,
(b) distance of PCA embedding, and (c) Hamming distance. Here EU* is defined
as the states of the European Community, excluding FRN & UK. These plots show
how relations between France, UK and the rest of the Western European states
changed over time, with France standing far apart during the 60s, and coming
back to the fold afterward.

Here EU* is defined as the states of the European Community, excluding
FRN & UK. The distances and diameters are calculated from the diffusion
distance, the distance of the PCA embedding, and the Hamming distance of the
VOTE matrix. The plots of the different distance measures show us how the
diffusion method amplifies the connections between highly connected actors,
and also enhances the separation between distant parties.
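
As a rough sketch of how such a ratio could be computed from embedding
coordinates: the report does not spell out how the distance from a country to
a group is defined, so the mean distance to the group's members used below is
an assumption, as are the function names and the made-up 2-D coordinates. The
diameter is taken as the largest pairwise distance within the group.

```python
import numpy as np

def dist_to_group(x, group):
    """Assumed definition: mean Euclidean distance from point x to the group members."""
    return float(np.mean(np.linalg.norm(group - x, axis=1)))

def diameter(group):
    """Largest pairwise Euclidean distance within the group."""
    gaps = group[:, None, :] - group[None, :, :]
    return float(np.linalg.norm(gaps, axis=2).max())

# Made-up 2-D embedding coordinates for one year (illustration only).
eu_star = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
frn = np.array([3.0, 4.0])
ukg = np.array([0.5, 0.5])

ratio_frn = dist_to_group(frn, eu_star) / diameter(eu_star)
ratio_ukg = dist_to_group(ukg, eu_star) / diameter(eu_star)
```

Dividing by the group diameter makes the ratio comparable across years even
when the overall scale of the embedding changes.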

France’s self-isolation under de Gaulle’s presidency is apparent from the
diffusion maps. In 1957 (Fig. 5a), France (cyan star, upper left corner) was
close to the USA, UK, Belgium, and Luxembourg (blue markers). By 1959, France
under Charles de Gaulle began to withdraw from NATO military commands and
completed that process in 1966. Thus, when we look at the maps as time
proceeds, we see France slowly move to the edge of the (blue) Western group
in 1960 (Fig. 5b), gradually edge further away by 1963 (Fig. 5c), and plant
itself in a position distant from that of the West in 1967 (Fig. 5e). The
distance ratio plot in Fig. 6a shows the blue line (FRN-EU) starting at
around 0.8 and the green line (FRN-UKG) reaching its peak at 9 in 1967-1968,
while the red line (UKG-EU) lies low initially, indicating France’s isolated
position from that of the Western countries (and UKG) at the time. After de
Gaulle left office in 1969, we see the blue line begin to decline steeply,
moving in tandem with the red line, implying a reversal of course in France’s
foreign policy, which gradually edged closer to that of the rest of the West.
Indeed, Fig. 5f shows France (cyan star, bottom left) moving back toward
integration in NATO. Its position in 1972-1973 (Fig. 5g-5h) drew closer and
closer to that of UKG (blue triangle, top left), as FRN opened up from its
self-isolation and allowed UKG to join the EC in 1973. By 1975 (Fig. 5i)
France again stood close to the Western bloc. From the 80s until the end of
the Cold War, the distance ratios FRN-EU and UKG-EU (blue & red lines,
Fig. 6a) ascended slightly, due to the absorption of new members into the EU.
The green line (FRN-UKG), however, remains low throughout the 80s, showing
how close FRN and UKG’s policies were to each other during that period.

The diffusion maps reveal the inherently low-dimensional structure among
countries, in agreement with prior analysis [1,7]. It is also apparent from
Fig. 6 that PCA fails to discover a pattern in the movements of countries in
the network, while diffusion distance uncovers the same pattern as the simple
Hamming distance. The spectrum given by PCA decays very slowly: it requires
20-30 dimensions to describe all variances in the voting data. The diffusion
method, on the other hand, requires only 5-7 dimensions to describe the
voting patterns [7]. The diffusion method also performs better in amplifying
significant events in its distance plot (e.g. the period 1957-1967 in which
France isolated itself). Note, however, that the diffusion distance in Fig. 6
is computed from only 5 dimensions, whereas the Hamming distance is the
aggregated result of votes on all UN resolutions in a particular year.


6.4  UN  vote  pattern:  The  collapse  of  the  Soviet  Union 


Fig. 7 shows the maps of nations according to their UN voting patterns at
various times during 1989-2005. The embedded positions are computed by our
diffusion method such that countries are placed closer to each other if they
voted similarly, and far apart if they did not. Fig. 8 compares 3 distance
metrics: (a) diffusion distance by our method (defined in Secs. 3-5), (b) PCA
embedded distance (Euclidean distance between data points embedded by a
Principal Component Analysis projection), and (c) Hamming distance
(normalized number of resolutions on which countries voted differently from
each other). Each subfigure plots the ratios of embedding distances in the
period 1965-2000:


•  dist(USA,EU) / diam(EU) as the blue line

•  dist(RUS,EU) / diam(EU) as the red line

•  dist(POL,EU) / diam(EU) as the green line

where EU is defined as the states of the European Community.
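
The Hamming distance in (c) above is simple enough to sketch directly. The
example below follows the description in the text (the fraction of
resolutions on which two countries voted differently); the function name and
the made-up vote records are illustrative assumptions.

```python
def hamming_distance(votes_a, votes_b):
    """Fraction of resolutions on which two countries voted differently."""
    if len(votes_a) != len(votes_b):
        raise ValueError("vote records must cover the same resolutions")
    differing = sum(1 for a, b in zip(votes_a, votes_b) if a != b)
    return differing / len(votes_a)

# Made-up vote records over five resolutions (illustration only).
pol = ["YES", "NO", "YES", "ABSTAIN", "NO"]
rus = ["YES", "NO", "YES", "NO", "NO"]
usa = ["NO", "YES", "YES", "ABSTAIN", "YES"]

h_pol_rus = hamming_distance(pol, rus)
h_pol_usa = hamming_distance(pol, usa)
```

Unlike the diffusion distance, this measure treats every resolution
independently and uses all of them, which is the contrast drawn in Sec. 6.3.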
Figure 7: The collapse of the Soviet Union: Diffusion maps of UN voting
pattern 1989-2005. Panels include (d) 1992, (e) 1993, (f) 1995. Several
countries are marked for case study identification: ★ (USA), ■ (UKG, FRN,
BEL, LUX), ★ (RUS), ♦ (YUG), ► (UKR, BLR), ■ (POL, HUN), • (CHN).


Figure 8: Ratios of embedding distances between USA-EU (blue), RUS-EU (red),
POL-EU (green) in 1965-2000, shown for (a) diffusion distance, (b) distance
of PCA embedding, and (c) Hamming distance. Here EU is defined as the states
of the European Community. These plots show how relations between USA, USSR,
Poland and the Western European states changed over time, with Poland tailing
the USSR until 1989, after which it was completely aligned with the West.
Figure 9: The disintegration of the Soviet Union (1988-1992): The evolution
of 2-dimensional diffusion maps of nations according to their voting patterns
in the UN Assembly. Each dot denotes the global position of a country in a
particular year. Special markers are drawn to denote: ★ (USA), ▲ (UKG),
★ (RUS), ■ (POL), • (CHN). Several lines are also plotted connecting the
“paths” of these countries over time. Note how USA and UKG stayed relatively
steady at their positions, while the paths of the Communist states started to
diverge in 1989. POL was the first to move out of the camp in 1990, followed
by RUS, whereas CHN remained in its original position throughout the whole
period.
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The  1989  diffusion  map  is  polarized  with  the  Western  bloc  (blue)  on  the 
left  and  the  Eastern  bloc  (red)  on  the  right  of  Fig.  7a.  The  distance  ratio  plots 
in  Fig.  8a  clearly  shows  the  green  line  (POL-EU)  trailing  the  red  line  (RUS- 
EU)  prior  to  1989,  indicating  Poland’s  policy  completely  dominated  by  that 
of  the  Soviet  Union.  However,  in  1990  (Fig.  7a),  Poland  and  Hungary  (red 
squares)  switched  to  the  left,  followed  quickly  by  Czecholovakia,  Bulgaria, 
and  then  the  three  newly  independent  Baltic  republics.  Fig.  8a  clearly  reveals 
a  break  between  the  green  line  and  the  red  line  from  1989,  showing  different 
trends  in  Poland  and  Russia’s  policies  from  then  on.  By  1991  (Fig.  7b), 
Russia  (red  star),  Belarus,  and  Ukraine  (2  red  triangles)  followed  suit,  as  they 
moved  toward  the  center.  In  1992,  after  the  Soviet  bloc  fully  disintegrated 
(Fig.  7d),  its  members  had  all  migrated  to  the  left,  with  Ukraine  and  Belarus 
hanging  in  the  middle,  leaving  China  (red  circle)  on  the  right,  close  to  the 
Arabs and the third world. Figs. 7d-7f depict Russia’s effort to get close
to  the  West,  as  Yeltsin  vied  for  Western  support  for  admission  to  NATO 
or  the  EU.  The  downward  trend  of  the  red  line  during  1992-1995  in  Fig.  8a 
indicates  Russia’s  aborted  attempt  to  get  close  to  the  EU.  After  Yeltsin’s 
second  election  in  1996  and  his  failure  to  court  the  West  (Fig.  7g),  Russia 
moved  to  the  right  of  the  map.  Fig.  8a  records  a  sharp  ascent  of  the  red 
line  after  1996,  implying  Russia’s  abandonment  of  its  westward  movement. 
Further  shift  eastward  occurred  after  Putin  replaced  Yeltsin  in  2000  (Fig.  7h), 
as  Russia  switched  to  the  right,  getting  close  to  China  again. 

The collapse is even more evident in Fig. 9, which provides a time-evolution
of the event by stringing the 2-dimensional structures of the alignments
in Fig. 7 along the time dimension. Two things are apparent from the figure:

• USA and UKG stood close to each other in the 2-dimensional alignment,
and their distance remained relatively stable throughout the 5-year
period.

•  The  break-up  of  the  Soviet  Union  is  shown  in  the  diverging  lines  of  RUS, 
POL  and  CHN.  The  Union  stayed  intact  until  1990,  when  POL  moved 
away,  toward  the  other  side  of  the  map.  In  1991,  RUS  inched  apart 
from  CHN  and  the  third-world  countries,  and  then  moved  completely 
out  by  1992. 

For further analysis, we consider the group of Communist countries in the
years 1989-1991. Fig. 10 shows the diffusion distances among these countries
in 1989-1991. The group was tight in 1989 and quickly disintegrated in 1990
and 1991, as the diffusion distances suddenly spiked up in these two years.

[Figure 10 panels: (a) 1989, (b) 1990, (c) 1991]

Figure 10: Diffusion distances among the countries in the Communist Bloc
(POL, HUN, CZE, ALB, YUG, BUL, ROM, RUS, UKR, BLR) in 1989-1991.
The colors denote distance value from low (cool, blue color) to high
(hot, red color).

7  Themes  across  Resolutions 

We  now  switch  emphasis  to  inferring  implicit  structure  among  resolutions. 
Since  voting  patterns  are  responsible  for  the  global  embedding,  further  in¬ 
sight  can  be  obtained  by  looking  at  those  resolutions  that  have  the  highest 
variance among clusters of countries. In essence we are asking: among nearby
countries, which topics are most controversial, i.e., on which resolutions do
neighbors vote differently? We focus, in particular, on the Soviet bloc of
Eastern European countries.

Numerical values are assigned to votes:

    against  →  -1
    abstain  →   0
    for      →  +1

so  we  can  compute  the  variances  of  the  votes  of  the  Eastern  Bloc  for  every 
UN  resolution  in  the  three  years  around  the  breakup  of  the  Bloc. 
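
The ranking step can be sketched as follows. This is a minimal illustration, assuming a hypothetical `votes` array (countries in rows, resolutions in columns) rather than the actual UN data; the function name `top_variance_resolutions` and the use of the population variance are illustrative choices, not taken from the report.

```python
import numpy as np

# Hypothetical toy data (not the UN records): rows are countries,
# columns are resolutions, coded -1 (against), 0 (abstain), +1 (for).
votes = np.array([
    [1,  1, 1, -1],
    [1,  0, 1,  1],
    [1, -1, 1,  1],
    [1,  1, 1, -1],
])

def top_variance_resolutions(votes, k):
    """Return the indices of the k highest-variance resolutions and all variances."""
    variances = votes.var(axis=0)                   # population variance per column
    order = np.argsort(-variances, kind="stable")   # descending; ties keep column order
    return order[:k], variances

top, variances = top_variance_resolutions(votes, k=2)
print(top, variances)
```

Unanimous resolutions (columns 0 and 2 in this toy array) have zero variance and therefore never appear among the most divisive ones.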

Table 1 shows a topical breakdown of the 20 highest-variance resolutions
among these countries’ votes in 1989-1991. During the first year most of the
attention remained focused on old Cold War issues and matters of development,
anti-colonialism, and human rights in the global south, on which the
Soviet bloc had commonly sided with less developed countries against the
developed north and west. But as early as 1990, and then in 1991, those
divisive issues had faded, and in their place Middle Eastern issues, especially
those focusing on Israel and the Palestinians, became dominant. On those issues
the US and Israel were in a minority even among other western states.
Consequently they became, and have remained, apart from the Assembly majority
as they had never been before.

                                  1989   1990   1991
Middle East                          2      7      6
Weapon Nonproliferation              2      6      5
Anti-Apartheid & Human Rights        6      2      2
Territory & Sovereignty              5      5      6
Others                               5      0      1

Table 1: Topical breakdown of the 20 highest-variance resolutions according
to the votes of Eastern Bloc members (POL, HUN, CZE, ALB, YUG, BUL,
ROM, RUS, UKR, BLR) during 1989-1991.

It is clear from this example that there are currents in the resolutions. Our
next goal is to discover them automatically. To avoid imposing preconceived
notions, we next adapt a hierarchical clustering algorithm and an eigenfunction
summary method.

8  Building  hierarchical  clustering  trees 

We  now  seek  to  organize  the  resolutions  according  to  how  countries  voted  on 
them,  with  the  goal  of  uncovering  themes  that  summarize  them.  Given  the 
lack of a prior on themes among resolutions - how many there are or even
whether any exist - we adapt a hierarchical clustering algorithm.

For  each  cluster  in  the  hierarchy,  we  seek  a  set  of  “summary  questions” 
that  best  approximate  large  groups  of  questions  underlying  the  embeddings. 
This  has  two  advantages:  (i)  it  reduces  the  dimension  of  the  data  set;  and  (ii) 
if  the  summary  questions  are  combinations  of  small  numbers  of  questions, 
they  are  simple  to  interpret.  This  latter  point  is  an  advantage  to  the  political 
scientists.  We  stress  that  our  approach  is  in  contrast  to  factor  analysis,  which 
leads  to  factors  that  are  linear  combinations  of  all  questions. 

We consider a pair of resolutions related if they are either highly correlated
or highly anticorrelated. For example, during the Cold War period, a UN
resolution condemning Israel on Middle East issues would most likely be
rejected by the West and supported by the Arabs; however, another UN resolution
in support of Israel would lead to the exact opposite voting pattern.
Therefore, we study the absolute value of data correlation as a topical
similarity function.
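
To make the similarity function concrete, here is a minimal sketch using `numpy.corrcoef`. The vote matrix is hypothetical, and the helper name `abs_correlation` is an illustrative choice rather than something defined in the report. Resolution 1 is constructed as the mirror image of resolution 0, so the two should be maximally similar under the absolute correlation.

```python
import numpy as np

def abs_correlation(votes):
    """Absolute Pearson correlation between resolutions (the columns of votes).

    A unanimous (constant) column has undefined correlation and yields NaN.
    """
    return np.abs(np.corrcoef(votes, rowvar=False))

# Hypothetical votes of six countries on three resolutions.
votes = np.array([
    [-1,  1,  1],
    [-1,  1, -1],
    [ 0,  0,  1],
    [ 1, -1,  1],
    [ 1, -1, -1],
    [ 1, -1,  0],
])

similarity = abs_correlation(votes)
print(np.round(similarity, 3))
```

Taking the absolute value is what lets a resolution and its politically "opposite" counterpart fall into the same theme.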

More  formally,  this  leads  to  a  relatively  standard  objective  function  that 
only  depends  on  dot  products.  It  can  be  modified  using  the  kernel  trick  to 
incorporate  non-linearities,  in  particular  those  that  arise  with  our  diffusion 
kernel. 

We treat each resolution as a vector of responses $q_i$, normalized so that

$$\sum_j q_i(j) = 0, \qquad \|q_i\| = 1.$$

We denote $Q = \{q_1, \ldots, q_n\}$, the set of votes on all resolutions.
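
A short sketch of this normalization, under the assumption that resolutions are stored as the columns of a hypothetical `votes` array. With zero-sum, unit-norm columns, the dot product of two resolutions equals their Pearson correlation, which is why the objective below can be written purely in terms of dot products. The function name and the error raised for unanimous resolutions are illustrative choices.

```python
import numpy as np

def normalize_resolutions(votes):
    """Center each column to zero sum and scale it to unit Euclidean norm."""
    centered = votes - votes.mean(axis=0)
    norms = np.linalg.norm(centered, axis=0)
    if np.any(norms == 0):
        # Assumption: unanimous resolutions carry no signal and are rejected here.
        raise ValueError("a unanimous resolution cannot be normalized")
    return centered / norms

# Hypothetical votes: five countries (rows) on three resolutions (columns).
votes = np.array([
    [-1.0, -1.0,  1.0],
    [-1.0,  0.0,  1.0],
    [ 1.0,  1.0, -1.0],
    [ 0.0,  1.0,  0.0],
    [ 1.0,  1.0,  0.0],
])

Q = normalize_resolutions(votes)
gram = Q.T @ Q   # entry (a, b) is the dot product <q_a|q_b>
```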

On the way to designing an objective function, we first seek to find a set
of “summary questions” $S = \{s_1, \ldots, s_k\}$ and a clustering
$C = \{c_1, \ldots, c_k\}$ of the questions, with the following properties:

$$\bigcup_{i=1}^{k} c_i = Q \qquad (13)$$

$$c_i \cap c_j = \emptyset, \quad i \neq j \qquad (14)$$

$$\|s_i\| = 1 \qquad (15)$$

Equations (13) and (14) make sure that each question is assigned to a
single cluster. We now want to maximize the similarity between each question
and the summary question it is assigned to. The objective function we seek
to maximize is defined as:

$$\phi(C, S) = \sum_{i=1}^{k} \sum_{q_j \in c_i} |\langle q_j | s_i \rangle|^2$$

In the bioinformatics community this objective is called the diametric
clustering objective function [4]. It has an equivalent metric clustering
minimization problem. Using the fact that $|\langle q_j | s_i \rangle|^2 \le 1$,

$$\begin{aligned}
\arg\max \phi(C, S) &= \arg\min \{ n - \phi(C, S) \} \\
&= \arg\min \sum_{i=1}^{k} \sum_{q_j \in c_i} \left( 1 - |\langle q_j | s_i \rangle|^2 \right) \\
&= \arg\min \sum_{i=1}^{k} \sum_{q_j \in c_i} d(q_j, s_i)^2
\end{aligned}$$

where $d(v, w) = \sqrt{1 - |\langle v | w \rangle|^2}$.


$d(\cdot, \cdot)$ is a pseudometric on unit vectors, which is to say

1. $d(v, v) = 0$

2. $d(v, w) = d(w, v)$

3. $d(u, v) + d(v, w) \ge d(u, w)$

Properties 1 and 2 are trivial. The proof of 3 is technical and is omitted
for space reasons.
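
The three properties can be checked numerically. The sketch below is not a proof; it samples random real unit vectors (dimension 5 is an arbitrary choice) and records the smallest slack in the triangle inequality. It also checks an identity that suggests why property 3 holds: for real unit vectors, $d(v, w) = \|vv^T - ww^T\|_F / \sqrt{2}$, so $d$ inherits the triangle inequality from the Frobenius norm. Finally, $d(v, -v) = 0$ shows why $d$ is only a pseudometric.

```python
import numpy as np

def d(v, w):
    """d(v, w) = sqrt(1 - <v|w>^2) for real unit vectors v and w."""
    return np.sqrt(max(0.0, 1.0 - float(np.dot(v, w)) ** 2))

def random_unit_vector(rng, dim):
    x = rng.standard_normal(dim)
    return x / np.linalg.norm(x)

rng = np.random.default_rng(0)

# Smallest value of d(u,v) + d(v,w) - d(u,w) over random triples.
worst_slack = np.inf
for _ in range(2000):
    a, b, c = (random_unit_vector(rng, 5) for _ in range(3))
    worst_slack = min(worst_slack, d(a, b) + d(b, c) - d(a, c))

# Compare d with the scaled Frobenius distance between rank-one projectors.
u = random_unit_vector(rng, 5)
v = random_unit_vector(rng, 5)
frobenius = np.linalg.norm(np.outer(u, u) - np.outer(v, v)) / np.sqrt(2)
antipodal = d(v, -v)   # distinct vectors at (numerically) zero distance
```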

The  maximization  version  of  this  problem  suggests  one  simple  heuristic, 
while  the  minimization  problem  suggests  another.  The  first  is  a  modification 
of  Lloyd’s  algorithm. 

procedure ModifiedLloyd({q_1, ..., q_n})
    cluster = initialclustering()
    while φ_old ≠ φ_new do
        φ_old = φ_new
        for i = 1 to k do
            V = concat(q ∈ c_i)        ▷ V = [q_{c_i,1} ... q_{c_i,m}]
            v_i = SVD(V)               ▷ v_i is largest left sing. vect.
        end for
        for j = 1 to n do
            put q_j in the cluster that maximizes |⟨v_i|q_j⟩|
        end for
        recompute φ_new
    end while
end procedure
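
One possible rendering of this procedure in Python is sketched below. It is a sketch under stated assumptions, not the authors' implementation: the random initial clustering, the handling of empty clusters (their summary vector is left unchanged), the stopping rule (stop when no label changes), and all names (`modified_lloyd`, `objective`, `history`) are illustrative. The demo data are random and merely stand in for normalized questions.

```python
import numpy as np

def objective(Q, labels, S):
    """phi(C, S): sum over questions of <q_j|s_i>^2 for the assigned summary s_i."""
    return float(np.sum(np.sum(Q * S[:, labels], axis=0) ** 2))

def modified_lloyd(Q, k, seed=0, max_iter=100):
    """Q is an (m, n) array whose columns are zero-sum, unit-norm questions."""
    rng = np.random.default_rng(seed)
    m, n = Q.shape
    labels = rng.integers(k, size=n)           # assumption: random initial clustering
    S = rng.standard_normal((m, k))
    S /= np.linalg.norm(S, axis=0)
    history = []
    for _ in range(max_iter):
        # First loop: summary question = largest left singular vector of the cluster.
        for i in range(k):
            members = Q[:, labels == i]
            if members.shape[1] > 0:           # assumption: empty clusters keep old s_i
                U, _, _ = np.linalg.svd(members, full_matrices=False)
                S[:, i] = U[:, 0]
        # Second loop: move each question to the summary it matches best.
        new_labels = np.argmax(np.abs(S.T @ Q), axis=0)
        history.append(objective(Q, new_labels, S))
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, S, history

# Hypothetical demo: 30 random "questions" answered by 12 "countries".
rng = np.random.default_rng(1)
raw = rng.standard_normal((12, 30))
raw -= raw.mean(axis=0)
Q = raw / np.linalg.norm(raw, axis=0)
labels, S, history = modified_lloyd(Q, k=3)
```

The list `history` records the objective after each pass, which makes it easy to inspect that the value never decreases from one pass to the next.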

This algorithm increases the objective function $\phi$ at each stage. In fact,
each for loop increases $\phi$.
It is instructive to consider how this might be proved. The second for
loop is straightforward, as each question is assigned to the cluster that
maximizes the objective. Therefore if any question changes cluster, the
objective function will increase.

Let $V$ be defined as above. Then $V = U D W^T$, where $U$ and $W$ are
unitary and $D$ is a diagonal matrix of singular values. Then

$$\sum_{q_j \in c_i} |\langle q_j | s \rangle|^2 = \| s^T V \|^2
  = \sum_i D_{ii}^2 \, |\langle u_i | s \rangle|^2$$

where $u_i$ are the columns of $U$. This is maximized by setting $s$ equal
to $u_1$, the left singular vector belonging to the largest singular value.
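
The claim can be verified numerically with `numpy.linalg.svd`. In this sketch the matrix `V` is random and only plays the role of the questions in one cluster; the helper name `cluster_score` and the comparison against 1000 random unit vectors are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
V = rng.standard_normal((6, 4))        # columns stand in for the questions of one cluster

U, D, Wt = np.linalg.svd(V, full_matrices=False)   # D holds the singular values
u1 = U[:, 0]                           # left singular vector of the largest singular value

def cluster_score(s, V):
    """sum_j <q_j|s>^2 = ||s^T V||^2 for the columns q_j of V."""
    return float(np.sum((s @ V) ** 2))

best = cluster_score(u1, V)

# Random unit vectors should never beat u1.
trials = rng.standard_normal((1000, 6))
trials /= np.linalg.norm(trials, axis=1, keepdims=True)
best_random = max(cluster_score(s, V) for s in trials)

# The expansion in the singular basis, evaluated for one arbitrary unit vector.
s = trials[0]
expansion = float(np.sum(D ** 2 * (U.T @ s) ** 2))
```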

Therefore each stage of the algorithm increases $\phi$. Since there are a finite
number of clusterings, and hence of values for $\phi$, and each stage of the
algorithm increases $\phi$, it converges, though possibly not to the global
optimum.

8.1 Toward Thematic Hierarchical Clustering

Although  Lloyd’s  algorithm  guarantees  a  local  maximum  in  the  objective 
function,  for  our  application  we  seek  a  related  -  but  in  a  local  sense,  slightly 
different  -  condition:  we  guarantee  that  the  absolute  correlation  distance  can¬ 
not  exceed  a  threshold.  Guaranteeing  this  condition  was  deemed  a  necessity 
by  the  political  scientists,  and  leads  to  a  variation  on  the  above  algorithm. 

We start with n individual singleton clusters of entities E and a data
matrix D of m countries (rows) and n resolutions (columns), such as the one
shown in Table 2. We also have a correlation threshold $\theta \in (0, 1)$ and
a cooldown ratio $\alpha \in (0, 1)$. We repeatedly iterate through the
following steps, merging clusters until only one remains:

procedure GreedyCluster(D, θ)
    unallocated = D
    for c in unallocated do
        remove c from unallocated
        for q in unallocated do
            if abs(corr(c, q)) > θ then
                remove q from unallocated
                assign q to cluster c
            end if
        end for
    end for
    reassign questions to most correlated cluster center
    return clusters
end procedure

procedure GreedyTree(D, θ, α)
    while numclusters > 1 do
        clusters = GreedyCluster(D, θ)
        set D to largest singular vector of each cluster
        θ = θα
    end while
end procedure

         #3508   #3510   #3515   #3538   #3570
USA        -1      -1      -1      -1      -1
UKG        -1      -1      -1       0       0
RUS         1       1       1       1       1
POL         0       0       0       0       0
CHN         1       1       0       1       1

Table 2: An excerpt from the UN voting data [16] of 5 countries (USA,
UKG, RUS, POL, CHN) in 1990 on 5 issues, denoted by their roll call id’s
(RCID): #3508 (Dissemination of information on decolonization), #3510
(Observer status of national liberation movements recognized by the OAU
and/or by the League of Arab States), #3515 (Cessation of all nuclear test
explosions), #3538 (Calls upon Israel to become party to the Treaty on the
Non-Proliferation of Nuclear Weapons), #3570 (Status of the International
Convention on the Suppression and Punishment of the crime of Apartheid).
The votes are represented by numbers: 1 (Yes), 0 (Abstain), -1 (No).

Performance is very similar to that of the Lloyd algorithm, which could in
effect be inserted into the first procedure.
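
The two greedy procedures can be sketched in Python as follows. This is an illustration under several assumptions rather than the implementation used for the results: a column joins a cluster when its absolute correlation with the seed exceeds the threshold, seeds are taken in column order, a `max_levels` guard is added in case clusters never merge, and the small vote matrix is hypothetical. All function and variable names are illustrative.

```python
import numpy as np

def normalize(D):
    """Center columns and scale to unit norm, so dot products are correlations.

    Assumes no column is constant (a unanimous vote would divide by zero).
    """
    C = D - D.mean(axis=0)
    return C / np.linalg.norm(C, axis=0)

def greedy_cluster(Q, theta):
    """Cluster the normalized columns of Q; returns one index array per cluster."""
    corr = np.abs(Q.T @ Q)
    unallocated = list(range(Q.shape[1]))
    centers = []
    while unallocated:
        c = unallocated.pop(0)                  # next free column seeds a cluster
        centers.append(c)
        # Columns correlated with the seed above theta are absorbed by it.
        unallocated = [q for q in unallocated if corr[c, q] <= theta]
    # Reassign every column to its most correlated cluster center.
    owner = np.argmax(corr[centers, :], axis=0)
    return [np.flatnonzero(owner == i) for i in range(len(centers))]

def greedy_tree(D, theta, alpha, max_levels=200):
    """Each level lists clusters as index arrays into the previous level's columns."""
    Q = normalize(D)
    levels = []
    for _ in range(max_levels):
        clusters = greedy_cluster(Q, theta)
        levels.append(clusters)
        if len(clusters) == 1:
            break
        # Summarize each cluster by its largest left singular vector.
        Q = np.column_stack([
            np.linalg.svd(Q[:, idx], full_matrices=False)[0][:, 0]
            for idx in clusters
        ])
        theta *= alpha                          # cool down the threshold
    return levels

# Hypothetical votes: six countries (rows) on six resolutions (columns).
D = np.array([
    [ 1, -1,  1,  1,  1, -1],
    [ 1, -1,  1, -1, -1,  1],
    [ 1, -1,  0,  1,  1, -1],
    [-1,  1, -1, -1, -1,  1],
    [-1,  1, -1,  1,  1, -1],
    [-1,  1, -1, -1,  0,  1],
], dtype=float)

levels = greedy_tree(D, theta=0.9, alpha=0.8)
```

In this toy example, columns 0 to 2 form one theme and columns 3 to 5 another at the first level; the two summaries merge only after the threshold has cooled enough.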
Figure 11: Clustering result of UN resolutions during the period 1998-2002.
Two individual clusters are marked with • and ■ symbols for demonstration.

8.2  Results  on  UN  Resolutions 

We applied the clustering algorithm to the set of UN resolutions during the
period 1998-2002 [16], with $\theta = 0.95$ and $\alpha = 0.8$. Fig. 11 shows
the clustering hierarchy with two clusters, • and ■. The resolutions in
cluster • pertain only to the Middle East, while cluster ■ comprises
resolutions from two topics (Human Rights and Nuclear Disarmament).

We  take  a  more  detailed  look  at  the  resolutions  during  the  breakup  of  the 
Soviet  bloc  of  countries  in  Figs.  12  -  14. 

Figure 12: Thematic clustering of UN Resolutions 1989. The ■ cluster is
about Middle East issues, while the • is about disarmament and nuclear
weapons. The variance in voting patterns across Eastern Bloc countries on
these issues is virtually 0.

Figure 13: Thematic clustering of UN Resolutions 1990. The variance across
clusters starts to increase, indicating political change. The cluster • on
Middle East issues is growing larger, while others (e.g. ■) remain fixed on
nuclear weapons issues.

Figure 14: Thematic clustering of UN Resolutions 1991. Again the Middle
East • cluster remains while the nuclear weapons cluster ■ enlarges to include
economic and other testing issues.

9  Summary

In this project we developed a diffusion-based approach to embedding
high-dimensional UN voting data and showed how to cluster the resolutions
“driving” these embeddings. Organization among countries revealed political
relationships, and cluster analysis revealed thematic threads running across
time. In effect we showed that much of the historical record can be “read out”
from UN voting patterns.

10  Publications  Arising  from  this  Project

• Liberty, E., and Zucker, S.W., The Mailman algorithm for matrix vector
multiplication, Information Processing Letters, 2009, 109(3), 179-182.

• Keller, Y., Lafon, S., Coifman, R., and Zucker, S.W., Audio-Visual
Group Recognition by diffusion maps, IEEE Transactions On Signal
Processing, 2010, 58(1), 403-413.

• Le, Minh Tam, Sweeney, J., Liberty, E. and Zucker, S.W., Similarity
Kernels via Bi-Clustering for Conventional Intergovernmental Organi¬
Security Analytics (PAISA-10), IEEE Intelligence and Security Informatics
Conference, Vancouver, 26 May, 2010.